May 10, 2021

Project Outline

  1. Introduction

  2. Materials and methods

  3. Results and discussion

    3.1. Exploratory data analysis

    3.2. Modelling

  4. Conclusion

Introduction

Introduction: Dataset

COVID-19 World Vaccine Adverse Reactions

  • Data from the Vaccine Adverse Event Reporting System (VAERS) created by the Food and Drug Administration (FDA) and Centers for Disease Control and Prevention (CDC)

  • Contains 3 datasets:

    1. PATIENTS.CSV

    2. VACCINES.CSV

    3. SYMPTOMS.CSV

  • Datasets connected by patient IDs (VAERS_ID)
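A minimal sketch of how the three tables could be combined on VAERS_ID. The tiny tibbles below are toy stand-ins for the CSV files, and `merged` is a hypothetical name; the real files would be read with something like readr::read_csv(), keeping VAERS_ID as character so leading zeros survive.

```r
library(dplyr)
library(tibble)

# Toy stand-ins for PATIENTS.CSV, VACCINES.CSV and SYMPTOMS.CSV
patients <- tibble(VAERS_ID = c("0916600", "0916601"), SEX = c("F", "F"))
vaccines <- tibble(VAERS_ID = c("0916600", "0916601"),
                   VAX_MANU = c("MODERNA", "MODERNA"))
symptoms <- tibble(VAERS_ID = c("0916600", "0916601"),
                   SYMPTOM1 = c("Dysphagia", "Anxiety"))

# Join the three tables on the shared key
merged <- patients %>%
  inner_join(vaccines, by = "VAERS_ID") %>%
  inner_join(symptoms, by = "VAERS_ID")
```

An inner join keeps only IDs present in all three tables; a left join from the patient table would be the alternative if unmatched patients should be kept.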

Introduction: Dataset

COVID-19 World Vaccine Adverse Reactions

PATIENTS.CSV: Contains information about the individuals who received the vaccines

## # A tibble: 34,121 x 35
##   VAERS_ID RECVDATE  STATE AGE_YRS CAGE_YR CAGE_MO SEX   RPT_DATE   SYMPTOM_TEXT      DIED  DATEDIED L_THREAT ER_VISIT
##   <chr>    <chr>     <chr>   <dbl>   <dbl>   <dbl> <chr> <date>     <chr>             <chr> <chr>    <chr>    <chr>   
## 1 0916600  01/01/20… TX         33      33      NA F     NA         "Right side of e… <NA>  <NA>     <NA>     <NA>    
## 2 0916601  01/01/20… CA         73      73      NA F     NA         "Approximately 3… <NA>  <NA>     <NA>     <NA>    
## 3 0916602  01/01/20… WA         23      23      NA F     NA         "About 15 minute… <NA>  <NA>     <NA>     <NA>    
## # … with 34,118 more rows, and 22 more variables: HOSPITAL <chr>, HOSPDAYS <dbl>, X_STAY <chr>, DISABLE <chr>,
## #   RECOVD <chr>, VAX_DATE <chr>, ONSET_DATE <chr>, NUMDAYS <dbl>, LAB_DATA <chr>, V_ADMINBY <chr>, V_FUNDBY <chr>,
## #   OTHER_MEDS <chr>, CUR_ILL <chr>, HISTORY <chr>, PRIOR_VAX <chr>, SPLTTYPE <chr>, FORM_VERS <dbl>,
## #   TODAYS_DATE <chr>, BIRTH_DEFECT <chr>, OFC_VISIT <chr>, ER_ED_VISIT <chr>, ALLERGIES <chr>

Introduction: Dataset

COVID-19 World Vaccine Adverse Reactions

VACCINES.CSV: Contains information about the vaccine received

## # A tibble: 34,630 x 8
##    VAERS_ID VAX_TYPE VAX_MANU           VAX_LOT  VAX_DOSE_SERIES VAX_ROUTE VAX_SITE VAX_NAME                          
##    <chr>    <chr>    <chr>              <chr>    <chr>           <chr>     <chr>    <chr>                             
##  1 0916600  COVID19  "MODERNA"          037K20A  1               IM        LA       COVID19 (COVID19 (MODERNA))       
##  2 0916601  COVID19  "MODERNA"          025L20A  1               IM        RA       COVID19 (COVID19 (MODERNA))       
##  3 0916602  COVID19  "PFIZER\\BIONTECH" EL1284   1               IM        LA       COVID19 (COVID19 (PFIZER-BIONTECH…
##  4 0916603  COVID19  "MODERNA"          unknown  <NA>            <NA>      <NA>     COVID19 (COVID19 (MODERNA))       
##  5 0916604  COVID19  "MODERNA"          <NA>     1               IM        LA       COVID19 (COVID19 (MODERNA))       
##  6 0916606  COVID19  "MODERNA"          011J20A  1               IM        LA       COVID19 (COVID19 (MODERNA))       
##  7 0916607  COVID19  "MODERNA"          <NA>     <NA>            IM        LA       COVID19 (COVID19 (MODERNA))       
##  8 0916608  COVID19  "MODERNA"          <NA>     1               IM        LA       COVID19 (COVID19 (MODERNA))       
##  9 0916609  COVID19  "MODERNA"          011J201A 1               IM        LA       COVID19 (COVID19 (MODERNA))       
## 10 0916610  COVID19  "MODERNA"          <NA>     1               SYR       LA       COVID19 (COVID19 (MODERNA))       
## # … with 34,620 more rows

Introduction: Dataset

COVID-19 World Vaccine Adverse Reactions

SYMPTOMS.CSV: Contains information about the symptoms experienced after vaccination

## # A tibble: 48,110 x 11
##   VAERS_ID SYMPTOM1    SYMPTOMVERSION1 SYMPTOM2   SYMPTOMVERSION2 SYMPTOM3   SYMPTOMVERSION3 SYMPTOM4  SYMPTOMVERSION4
##   <chr>    <chr>                 <dbl> <chr>                <dbl> <chr>                <dbl> <chr>               <dbl>
## 1 0916600  Dysphagia              23.1 Epiglotti…            23.1 <NA>                  NA   <NA>                 NA  
## 2 0916601  Anxiety                23.1 Dyspnoea              23.1 <NA>                  NA   <NA>                 NA  
## 3 0916602  Chest disc…            23.1 Dysphagia             23.1 Pain in e…            23.1 Visual i…            23.1
## 4 0916603  Dizziness              23.1 Fatigue               23.1 Mobility …            23.1 <NA>                 NA  
## 5 0916604  Injection …            23.1 Injection…            23.1 Injection…            23.1 Injectio…            23.1
## 6 0916606  Pharyngeal…            23.1 <NA>                  NA   <NA>                  NA   <NA>                 NA  
## # … with 48,104 more rows, and 2 more variables: SYMPTOM5 <chr>, SYMPTOMVERSION5 <dbl>

Introduction: Aim

The aim of this project is to gain insight into the adverse effects of different COVID-19 vaccines and answer questions such as:

  • Do some vaccines cause more/different symptoms than others?

  • Do patients with some profiles get more/different symptoms?

  • Are certain symptoms correlated with death?

  • Is patient profile correlated with death?

  • Does taking anti-inflammatory drugs reduce the chance of having symptoms?

Methods

Methods: Project workflow

Methods: Challenges and solutions

Cleaning

  • NAs that should be interpreted as “no” → replace_na(list(ALLERGIES = "N"))
  • Duplicated IDs → add_count(VAERS_ID) %>% filter(n == 1) %>% select(-n)
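The two cleaning steps can be sketched on a toy table as follows. The allergy strings are made up for illustration; only the column names and the dplyr/tidyr calls come from the slide.

```r
library(dplyr)
library(tidyr)

# Toy patient table: one missing ALLERGIES value and one duplicated ID
patients <- tibble::tibble(
  VAERS_ID  = c("0916600", "0916601", "0916601", "0916602"),
  ALLERGIES = c("Penicillin", NA, NA, "Shellfish")
)

patients_clean <- patients %>%
  # A missing entry is read as "no allergies reported"
  replace_na(list(ALLERGIES = "N")) %>%
  # Keep only IDs that occur exactly once; every copy of a duplicated ID is dropped
  add_count(VAERS_ID) %>%
  filter(n == 1) %>%
  select(-n)
```

Note that this removes all rows of a duplicated ID rather than keeping one of them, which avoids guessing which copy is correct.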

Augment

  • Columns containing long string descriptions → Make tidy categorical (Y/N) variables
## # A tibble: 3 x 3
##   VAERS_ID OTHER_MEDS                     TAKES_ANTIINFLAMATORY
##   <chr>    <chr>                          <chr>                
## 1 0916983  <NA>                           N                    
## 2 0916988  Ibuprofen  PM the night before Y                    
## 3 0916996  Clobetasol, Benadryl           N
  • Too many symptoms and untidy → extract top 20 occurring symptoms and turn them into tidy categorical (TRUE/FALSE) columns
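A hedged sketch of both augmentation steps. The drug-name pattern is a deliberately short, hypothetical list (the slides do not say which names were matched), the symptom table is toy data, and the top 2 symptoms are extracted instead of the top 20 to keep the example small.

```r
library(dplyr)
library(tidyr)
library(stringr)

# 1) Free text -> Y/N flag
meds <- tibble::tibble(
  VAERS_ID   = c("0916983", "0916988", "0916996"),
  OTHER_MEDS = c(NA, "Ibuprofen  PM the night before", "Clobetasol, Benadryl")
)
# Hypothetical list of drug names; the real list is an assumption
antiinflammatory <- regex("ibuprofen|naproxen|aspirin", ignore_case = TRUE)
meds <- meds %>%
  mutate(TAKES_ANTIINFLAMATORY = if_else(
    str_detect(coalesce(OTHER_MEDS, ""), antiinflammatory), "Y", "N"
  ))

# 2) SYMPTOM1..SYMPTOM5 -> one TRUE/FALSE column per frequent symptom
symptoms <- tibble::tibble(
  VAERS_ID = c("A", "B", "C", "D"),
  SYMPTOM1 = c("Dizziness", "Dizziness", "Dizziness", "Anxiety"),
  SYMPTOM2 = c("Fatigue", "Dyspnoea", "Fatigue", NA)
)
symptoms_long <- symptoms %>%
  # Match SYMPTOM1..5 only, not the SYMPTOMVERSION columns
  pivot_longer(matches("^SYMPTOM\\d$"), names_to = "SLOT", values_to = "SYMPTOM") %>%
  select(-SLOT) %>%
  drop_na(SYMPTOM) %>%
  distinct()

top_symptoms <- symptoms_long %>%
  count(SYMPTOM, sort = TRUE) %>%
  slice_head(n = 2) %>%   # 20 in the project
  pull(SYMPTOM)

symptoms_wide <- symptoms_long %>%
  mutate(present = TRUE) %>%
  pivot_wider(names_from = SYMPTOM, values_from = present, values_fill = FALSE) %>%
  select(VAERS_ID, all_of(top_symptoms))
```

Pivoting before selecting keeps patients whose symptoms are all outside the top list; they simply get FALSE in every column.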

Results and Discussion

Exploratory Data Analysis

Visualization

Group representation: Age, sex and manufacturer

## # A tibble: 3 x 2
##   SEX       n
##   <chr> <int>
## 1 F     24070
## 2 M      8514
## 3 <NA>    828
## # A tibble: 3 x 2
##   VAX_MANU            n
##   <chr>           <int>
## 1 JANSSEN          1106
## 2 MODERNA         16253
## 3 PFIZER-BIONTECH 16053

Visualization

When do symptoms appear?

Hypothesis: two peaks corresponding to the innate and acquired immune responses. Stronger innate response in younger individuals.

Visualization

Does age/sex impact the number of symptoms?

Observations: no noticeable differences.

Visualization

Does vaccine type impact the number of symptoms?

Observations: no noticeable differences.

Visualization

Do symptoms differ among age groups?

Hypothesis: symptoms associated with a strong immune response are most common in younger groups.

Visualization

Do symptoms differ in males and females?

Observations: some noticeable differences, including death, which is more common in males.

Visualization

Do different vaccines cause different symptoms?

Observations: the most common symptoms are reported more often for Janssen (NB: smaller sample size).

Exploratory Data Analysis

Principal Component Analysis

PCA

Do vaccine types cluster together?

Observations: no observable clustering.
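One way such a PCA could be set up, sketched in base R on simulated data. The symptom columns, the data and the object names are all hypothetical; the slides do not state which variables entered the PCA.

```r
set.seed(1)
n <- 200

# Simulated 0/1 symptom indicators (hypothetical column names)
symptom_flags <- data.frame(
  HEADACHE = rbinom(n, 1, 0.40),
  PYREXIA  = rbinom(n, 1, 0.30),
  FATIGUE  = rbinom(n, 1, 0.35)
)
# Simulated manufacturer labels, unrelated to the symptoms by construction
vax_manu <- sample(c("JANSSEN", "MODERNA", "PFIZER-BIONTECH"), n, replace = TRUE)

# Centre and scale the indicators, then project onto principal components
pca <- prcomp(symptom_flags, center = TRUE, scale. = TRUE)

# Share of variance captured by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# First two component scores, labelled by manufacturer for plotting
scores <- data.frame(pca$x[, 1:2], VAX_MANU = vax_manu)
```

Plotting PC1 against PC2 coloured by VAX_MANU then shows whether points from the same manufacturer group together.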

Modelling

Logistic Regressions

Logistic Regression

Is the patient’s profile correlated with death?

glm(death ~ sex + age + allergic/not + ill/not + has/had covid/not, family = binomial)

Observations: Age, illness and being male are positively correlated with death (p-value < 0.05).
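The model formula above is schematic. A runnable sketch on simulated data might look like this; HAS_ALLERGIES, HAS_ILLNESS and HAS_COVID are hypothetical names for the derived Y/N variables, and the simulated effects are chosen for illustration only, not taken from the project's results.

```r
set.seed(42)
n <- 2000

# Simulated patient profiles (all values are made up)
patients <- data.frame(
  SEX           = factor(sample(c("F", "M"), n, replace = TRUE)),
  AGE_YRS       = round(runif(n, 18, 90)),
  HAS_ALLERGIES = sample(c(TRUE, FALSE), n, replace = TRUE),
  HAS_ILLNESS   = sample(c(TRUE, FALSE), n, replace = TRUE),
  HAS_COVID     = sample(c(TRUE, FALSE), n, replace = TRUE)
)

# Simulated outcome: risk rises with age and is higher for males
log_odds <- -5 + 0.05 * patients$AGE_YRS + 0.5 * (patients$SEX == "M")
patients$DIED <- rbinom(n, 1, plogis(log_odds))

# Logistic regression of death on the patient profile
fit <- glm(DIED ~ SEX + AGE_YRS + HAS_ALLERGIES + HAS_ILLNESS + HAS_COVID,
           data = patients, family = binomial)

coefs <- coef(summary(fit))     # estimates, standard errors, z and p-values
odds_ratios <- exp(coef(fit))   # coefficients on the odds-ratio scale
```

A positive coefficient with a small p-value indicates an association with the outcome in the reported data, not a causal effect.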

Logistic Regression

Are some symptoms correlated with death?

glm(death ~ symptoms, family = binomial)

Observations: asthenia, dyspnoea and vomiting are positively correlated with death (p-value < 0.05).

Many Logistic Regressions

Do anti-inflammatory drugs reduce symptoms?

glm(each symptom ~ takes anti-inflammatory/not, family = binomial)

Observations: no symptom is significantly reduced (significance threshold: p-value < 0.05). Possible increase?
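Fitting one model per symptom can be sketched as a loop over symptom columns. The data are simulated, the symptom column names are hypothetical, and `fit_one` is an illustrative helper rather than the project's code.

```r
set.seed(7)
n <- 500

# Simulated data: exposure flag plus three 0/1 symptom indicators
dat <- data.frame(
  TAKES_ANTIINFLAMATORY = factor(
    sample(c("N", "Y"), n, replace = TRUE, prob = c(0.8, 0.2)),
    levels = c("N", "Y")   # "N" is the reference level
  ),
  HEADACHE = rbinom(n, 1, 0.40),
  PYREXIA  = rbinom(n, 1, 0.30),
  FATIGUE  = rbinom(n, 1, 0.35)
)
symptom_cols <- c("HEADACHE", "PYREXIA", "FATIGUE")

# Fit symptom ~ TAKES_ANTIINFLAMATORY and return the exposure coefficient
fit_one <- function(symptom) {
  fit <- glm(reformulate("TAKES_ANTIINFLAMATORY", response = symptom),
             data = dat, family = binomial)
  est <- coef(summary(fit))["TAKES_ANTIINFLAMATORYY", ]
  data.frame(symptom        = symptom,
             log_odds_ratio = est[["Estimate"]],
             p_value        = est[["Pr(>|z|)"]])
}

results <- do.call(rbind, lapply(symptom_cols, fit_one))
```

A negative log odds ratio would point to fewer reports of that symptom among people taking anti-inflammatory drugs, a positive one to more.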

Modelling

Proportion tests

Proportion tests

Does the proportion of deaths differ among vaccine types?

## # A tibble: 2 x 4
##   DIED  JANSSEN MODERNA `PFIZER-BIONTECH`
##   <chr>   <dbl>   <dbl>             <dbl>
## 1 N        1090   15281             15212
## 2 Y          16     972               841

Observations: there is a significant difference among vaccine types (p-value < 0.05).
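The counts in the table can be passed to a test of equal proportions. The sketch below assumes base R's prop.test() was the test used; the slides do not name the function.

```r
# Counts copied from the table above
deaths    <- c(JANSSEN = 16,   MODERNA = 972,   "PFIZER-BIONTECH" = 841)
survivors <- c(JANSSEN = 1090, MODERNA = 15281, "PFIZER-BIONTECH" = 15212)
totals    <- deaths + survivors

# Proportion of reports with DIED = Y per manufacturer
death_rate <- deaths / totals

# Test of the null hypothesis that all three proportions are equal
test <- prop.test(x = deaths, n = totals)
test$p.value
```

The same call with the female and male counts covers the comparison between sexes.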

Proportion tests

Does the proportion of deaths differ between males and females?

## # A tibble: 2 x 3
##   DIED      F     M
##   <chr> <dbl> <dbl>
## 1 N     23271  7523
## 2 Y       799   991

Observations: death is significantly more common in males (p-value < 0.05).

Conclusion

Uneven group sizes make the results difficult to interpret. With that caveat, we observed the following:

  1. Most reported symptoms appear immediately after vaccination.

  2. Symptoms associated with the expected immune response occur more often in younger age groups.

  3. Age, illness and being male are positively correlated with death.

  4. Asthenia, dyspnoea and vomiting are positively correlated with death.

  5. Anti-inflammatory drugs do not significantly reduce symptoms.

  6. The proportion of deaths varies among vaccine types.

  7. The proportion of deaths is higher in males.

References